38 research outputs found

    Optimization-based modeling of Lombard speech articulation: Supraglottal characteristics

    This paper shows that a highly simplified model of speech production based on the optimization of articulatory effort versus intelligibility can account for some observed articulatory consequences of signal-to-noise ratio. Simulations of static vowels in the presence of various background noise levels show that the model predicts articulatory and acoustic modifications of the type observed in Lombard speech. These features were obtained only when the constraint applied to articulatory effort decreases as the level of background noise increases. These results support the hypothesis that Lombard speech is listener oriented and speakers adapt their articulation in noisy environments.

    Fluency-related Temporal Features and Syllable Prominence as Prosodic Proficiency Predictors for Learners of English with Different Language Backgrounds

    Prosodic features are important in achieving intelligibility, comprehensibility, and fluency in a second or foreign language (L2). However, research on the assessment of prosody as part of oral proficiency remains scarce. Moreover, the acoustic analysis of L2 prosody has often focused on fluency-related temporal measures, neglecting language-dependent stress features that can be quantified in terms of syllable prominence. Introducing the evaluation of prominence-related measures can be of use in developing both teaching and assessment of L2 speaking skills. In this study we compare temporal measures and syllable prominence estimates as predictors of prosodic proficiency in non-native speakers of English with respect to the speaker's native language (L1). The predictive power of temporal and prominence measures was evaluated for utterance-sized samples produced by language learners from four different L1 backgrounds: Czech, Slovak, Polish, and Hungarian. First, the speech samples were assessed using the revised Common European Framework of Reference scale for prosodic features. The assessed speech samples were then analyzed to derive articulation rate and three fluency measures. Syllable-level prominence was estimated by a continuous wavelet transform analysis using combinations of F0, energy, and syllable duration. The results show that the temporal measures serve as reliable predictors of prosodic proficiency in the L2, with prominence measures providing a small but significant improvement to prosodic proficiency predictions. The predictive power of the individual measures varies both quantitatively and qualitatively depending on the L1 of the speaker. We conclude that the possible effects of the speaker's L1 on the production of L2 prosody in terms of temporal features as well as syllable prominence deserve more attention in applied research and in developing teaching and assessment methods for spoken L2.

    The role of duration and pitch in signaling quantity in Finnmark North Sámi

    Ternary quantity opposition is a cross-linguistically extremely rare typological feature. One of the languages using ternary opposition of consonants to signal linguistic contrasts is North Sámi, an endangered language spoken in several countries in northernmost Scandinavia. Previous studies have shown that while the contrast between the two shorter quantity degrees is phonetically robustly realized using segmental durations, phonetic differences between the two longer degrees are much more subtle and show considerable regional variation. In this work we investigate other prosodic means that might be used to mark the contrast alongside duration, namely f0 movement and range. We show that the North Sámi speakers who are also native speakers of Norwegian use pitch to co-signal the differences between the two higher quantity degrees, while speakers who are Finnish-North Sámi bilinguals use primarily durational cues. Interpreting these findings in the light of prosodic characteristics of the majority languages (Finnish and Norwegian), we argue that these regional differences reflect the majority language influence, which can be a source of the ongoing dialectal divergence and potential language change.

    Optimal control of speech with context-dependent articulatory targets

    This paper presents a computational implementation of phonetic planning which consists of choosing the position of articulatory targets which satisfy conflicting linguistic and extra-linguistic requirements. We present a minimal model that considers intelligibility and least effort as task requirements. To achieve the context-dependent variability of targets, our model approximates intelligibility as a function of target phoneme recognition probability given a vector of articulatory parameters. Preliminary experiments show that our minimal computational model of phonetic planning is able to predict two types of hypoarticulation by adjusting the weight assigned to effort: vowel centralization and stop consonant lenition.

    Effect of feeding of different sources of NPN on production performance of dairy cows.

    Received: 2016-04-11 | Accepted: 2016-05-04 | Available online: 2016-12-22 | http://dx.doi.org/10.15414/afz.2016.19.04.163-166
    The aim of the study was to analyse the effect of feeding different sources of NPN on nutrient utilization and production performance of dairy cows under field conditions. Balancing diets for crude protein without consideration of protein quality or rumen degradability often led to overfeeding of nitrogen and less than optimum production. High-yielding dairy cows were separated into two groups of 85 and 80 cows for the trial. The groups were consistent in stage of production and reproduction cycle as well as age structure. Both groups were fed a concentrate mixture of the same composition, differing only in the NPN/microbial protein source, at the same dosage of 100 g per cow per day. The field trial ran for 3 consecutive months. Performance data were collected in accordance with official milk recording. In both groups the majority of cows were in first lactation. For first lactations, daily milk production differed significantly, by 2.87 kg (P<0.01) for group 2; fat content differed by 0.07% for group 2 (non-significant), whereas protein content differed by 0.18% for group 1 (significant, P<0.01). Considering the first three lactations, group 2 produced 1.7 kg more milk per day (P<0.08), with 0.05% more fat and 0.002% less protein than group 1. The space created in dry matter intake by a concentrated slow-release NPN can be filled with high-quality forage, which could reduce the cost of feeding while maintaining levels of production.
    Keywords: Holstein, slow-release urea, microbial protein, milk yield

    Dialectal variation of duration patterns in Finnmark North Sámi quantity

    Ternary length contrast is a rare phonological feature, investigated here both in terms of its realization and possible ongoing changes. In North Sami, a phonetically under-documented and endangered Fenno-Ugric language spoken by indigenous people in Northern Europe, the ternary quantity contrast is assumed to be signalled by a progressive lengthening of a consonant and a compensatory shortening of the previous vowel. This study evaluates this assumption and compares the realization of the length contrasts in two dialects, the Western and Eastern Finnmark North Sami. The results show that while the contrast between the short and the two longer quantities is robustly signalled regardless of the dialect, the durational differences between the two longer quantities are maintained only in the Eastern dialect. On the other hand, a vowel quantity contrast independent of the quantity of the following consonant is present in the Western but not in the Eastern dialect. Further, comparing the phonetic realization of the ternary quantity contrast for speakers of different ages presents evidence of a language change: the results indicate an ongoing neutralization of the ternary contrast in younger speakers, which points to a possible disappearance of this rare typological feature in Finnmark North Sami.

    Analysis of speech prosody using WaveNet embeddings: The Lombard effect

    We present a novel methodology for speech prosody research based on the analysis of embeddings used to condition a convolutional WaveNet speech synthesis system. The methodology is evaluated using a corpus of Lombard speech, pre-processed in order to preserve only prosodic characteristics of the original recordings. The conditioning embeddings are trained to represent the combined influences of three sources of prosodic variation present in the corpus: the level and type of ambient noise, and the sentence focus type. We show that the resulting representations can be used to quantify the prosodic effects of the underlying influences, as well as interactions among them, in a statistically robust way. Comparing the results of our analysis with the results of a more traditional examination indicates that the presented methodology can be used as an alternative method of phonetic analysis of prosodic phenomena.

    Prosodic Representations of Prominence Classification Neural Networks and Autoencoders Using Bottleneck Features

    Prominence perception has been known to correlate with a complex interplay of the acoustic features of energy, fundamental frequency, spectral tilt, and duration. The contribution and importance of each of these features in distinguishing between prominent and non-prominent units in speech is not always easy to determine, and more so, the prosodic representations that humans and automatic classifiers learn have been difficult to interpret. This work focuses on examining the acoustic prosodic representations that binary prominence classification neural networks and autoencoders learn for prominence. We investigate the complex features learned at different layers of the network as well as the 10-dimensional bottleneck features (BNFs), for the standard acoustic prosodic correlates of prominence separately and in combination. We analyze and visualize the BNFs obtained from the prominence classification neural networks as well as their network activations. The experiments are conducted on a corpus of Dutch continuous speech with manually annotated prominence labels. Our results show that the prosodic representations obtained from the BNFs and higher-dimensional non-BNFs provide good separation of the two prominence categories, with, however, different partitioning of the BNF space for the distinct features, and the best overall separation obtained for F0.

    Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis

    Recent advances in deep learning methods have elevated synthetic speech quality to human level, and the field is now moving towards addressing prosodic variation in synthetic speech. Despite successes in this effort, the state-of-the-art systems fall short of faithfully reproducing local prosodic events that give rise to, e.g., word-level emphasis and phrasal structure. This type of prosodic variation often reflects long-distance semantic relationships that are not accessible for end-to-end systems with a single sentence as their synthesis domain. One of the possible solutions might be conditioning the synthesized speech by explicit prosodic labels, potentially generated using longer portions of text. In this work we evaluate whether augmenting the textual input with such prosodic labels capturing word-level prominence and phrasal boundary strength can result in more accurate realization of sentence prosody. We use an automatic wavelet-based technique to extract such labels from speech material, and use them as an input to a Tacotron-like synthesis system alongside textual information. The results of objective evaluation of synthesized speech show that using the prosodic labels significantly improves the output in terms of faithfulness of f0 and energy contours, in comparison with state-of-the-art implementations.